Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition
Abstract
Gesture recognition has attracted considerable attention owing to its great potential in applications. Although progress has been made recently in multi-modal learning methods, existing methods still lack effective integration to fully explore synergies among spatio-temporal modalities for gesture recognition. The problems are partially due to the fact that manually designed network architectures have low efficiency in the joint learning of multi-modalities. In this paper, we propose the first neural architecture search (NAS)-based method for RGB-D gesture recognition. The proposed method includes two key components: 1) enhanced temporal representation via the 3D Central Difference Convolution (3D-CDC) family, which is able to capture rich temporal context by aggregating temporal difference information; and 2) optimized backbones for multi-sampling-rate branches and lateral connections among varied modalities. The resultant multi-modal multi-rate network provides a new perspective to understand the relationship between the RGB and depth modalities and their temporal dynamics. Comprehensive experiments are performed on three benchmark datasets (IsoGD, NvGesture, and EgoGesture), demonstrating state-of-the-art performance in both single- and multi-modality settings. The code is available at https://github.com/ZitongYu/3DCDC-NAS .
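The abstract does not define the 3D-CDC operators, so the following is only a minimal sketch of the general central-difference idea, reduced to one temporal dimension. It assumes the common formulation in which the output blends a vanilla convolution with a convolution over differences from the centre sample, weighted by a hyperparameter theta. The function name `central_difference_conv1d`, the parameter `theta`, and the 1D simplification are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence


def central_difference_conv1d(
    x: Sequence[float], w: Sequence[float], theta: float = 0.5
) -> List[float]:
    """Blend a vanilla 1D convolution with its central-difference counterpart.

    Hypothetical 1D illustration (assumed formulation, not the paper's code).
    For each valid position t (no padding) and an odd-length kernel of radius r:
        vanilla(t)    = sum_k w[k] * x[t + k - r]
        difference(t) = sum_k w[k] * (x[t + k - r] - x[t])
        out(t)        = theta * difference(t) + (1 - theta) * vanilla(t)
                      = vanilla(t) - theta * x[t] * sum(w)
    """
    if len(w) % 2 == 0:
        raise ValueError("kernel length must be odd so that a centre sample exists")
    r = len(w) // 2
    w_sum = sum(w)
    out: List[float] = []
    for t in range(r, len(x) - r):
        # Plain weighted sum over the temporal window centred at t.
        vanilla = sum(w[k] * x[t + k - r] for k in range(len(w)))
        # Subtracting theta * x[t] * sum(w) is algebraically the same as
        # mixing in the difference term with weight theta.
        out.append(vanilla - theta * x[t] * w_sum)
    return out
```

With theta set to 0 the function reduces to an ordinary convolution, and with theta set to 1 it responds only to changes relative to the centre frame, so a constant signal maps to zero.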
Similar resources
Challenges in Multi-modal Gesture Recognition
This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011-2015. We began right at the start of the Kinect™ revolution when inexpensive infrared cameras providing image depth recordings became available. We published papers using this technology and other more conventional methods, including regular video cameras,...
Bayesian Co-Boosting for Multi-modal Gesture Recognition
With the development of data acquisition equipment, more and more modalities become available for gesture recognition. However, there still exist two critical issues for multimodal gesture recognition: how to select discriminative features for recognition and how to fuse features from different modalities. In this paper, we propose a novel Bayesian Co-Boosting framework for multi-modal gesture ...
Multi-modal Integration for Gesture and Speech
Demonstratives, in particular gestures that “only” accompany speech, are not a big issue in current theories of grammar. If we deal with gestures, fixing their function is one big problem, the other one is how to integrate the representations originating from different channels and, ultimately, how to determine their composite meanings. The growing interest in multi-modal settings, computer sim...
tight frame approximation for multi-frames and super-frames
In this thesis, a generator for multi-frames or super-frames generated under the action of a projective unitary representation of discrete countable groups is studied. Examples of such frames are Gabor multi-frames, Gabor super-frames, and frames for shift-invariant subspaces. We show that there exists a unique normalized tight multi-frame (super-frame) generator that is at minimum distance from it. Similar problems for dual frames are also posed, and some ...
Hybridization of Facial Features and Use of Multi Modal Information for 3D Face Recognition
Despite achieving good performance in controlled environments, conventional 3D face recognition systems still encounter problems in handling large variations in lighting conditions, facial expression, and head pose. Humans use a hybrid approach to recognize faces, and therefore in the proposed method the human face recognition ability is incorporated by combining global and local ...
Journal
Journal title: IEEE Transactions on Image Processing
Year: 2021
ISSN: 1057-7149, 1941-0042
DOI: https://doi.org/10.1109/tip.2021.3087348